BLOCK TENSORS AND SYMMETRIC EMBEDDINGS 
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Abstract. Well known connections exist between the singular value decomposition of a matrix 
A and the Schur decomposition of its symmetric embedding sym(j4) = ([OA; 0]). In particular, 
if (T is a singular value of A then +a and —a are eigenvalues of the symmetric embedding. The top 
and bottom halves of sym(A)'s eigenvectors are singular vectors for A. Power methods applied to 
A can be related to power methods applied to sym(A). The rank of sym(j4) is twice the rank of 
A. In this paper we develop similar connections for tensors by building on L-H. Lim's variational 
approach to tensor singular values and vectors. We show how to embed a general order-d tensor A 
into an order-d symmetric tensor sym(.A). Through the embedding we relate power methods for A's 
singular values to power methods for sym(yl)'s eigenvalues. Finally, we connect the multilinear and 
outer product rank of A to the multilinear and outer product rank of sym(^). 
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1. Introduction. If AeR"^^"^, then there are well-known connections be- 
tween its singular value decomposition (SVD) and the eigenvalue and eigenvector 
properties of the symmetric matrix 



A 




sym(A) = 
If A = f/SV^ is the SVD of A, then for k 



A 




Uk 



where Uk = U{:,k), Vk = V{:,k), and ak = 
sym(A) is through the Rayleigh quotients 



G J^("l+"2)x(ni+Ti2) 

l:rank(A) 



Uk 
±Vk 



(1.1) 



(1.2) 



S(fc, fc). Another way to connect A and 



(j)A{u,v) 
and 



u Av 



= 51 XI Mn,i2)u{ii)v{i2) 



\ii = l 12 = 1 




1 x^Cx 



^ / N N 

- I ^ ^ C{ii,i2)x{ii)x{i2 

\ii=l 12 = 1 




Ivh) (1-3) 



(1.4) 



where u e R"^ , v e M"^ , N = ni+n2, a; e H , and C ~ sym(A). If a; is a stationary 
vector for <p^^^"^\ then u — x{l:ni) and v — x{ni + l:ni + 77,2) render a stationary 
value for (I) a- See [51 p. 448]. 

In this paper we discuss these notions as they apply to tensors. An order-d ten- 
sor A G j^"ix - x"£i ig a, real d-dimensional array A{l:ni, . . . , lin^) where the index 
range in the fc-th mode is from 1 to n^. The idea of embedding a general tensor into 
a larger symmetric tensor having the same order is developed in §2. This requires 
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having a facility with block tensors. Fundamental ordcrings, unfoldings, and multi- 
linear summations are discussed in §3 and used in §4 where we characterize various 
multilinear Rayleigh quotients and their stationary values and vectors. This builds on 
the variational approach to tensor singular values developed in |15j . In §5 wc provide 
a symmetric embedding analysis of several higher-order power methods for tensors 
that have recently been proposed [101 El El El IE] ■ Results that relate the multilin- 
ear and outer product ranks of a tensor to the corresponding ranks of its symmetric 
embedding are presented in §6. A brief conclusion section follows. 

Before we proceed with the rest of the paper, we use the case of third-order 
tensors to preview some of the main ideas and to establish notation. (The busy 
reader already familiar with basic tensor computations and notation may safely skip 
to §2.) The starting point is to define the trilinear Rayleigh quotient 

( "1 "2 "3 \ / 

'^^^A{ii,i2,i3)u(ii)v{i2)w{i3)\ / (|| u || II2 || ii' jl2) 
il = l 22 = 1 43 = 1 / / 

(1.5) 

where A G ]R"l''"^''"^^t e w e IR"^ and w e IR"^ Calligraphic characters are 
used for tensors: A{ii,i2,i3) is entry (ii, j2,*3) of A. 

The singular values and vectors of A are the critical values and vectors of (/)_4 
as formulated in |15| . A simple expression for the gradient \7(t>A is made possible 
by unfolding A ~ [o-ijk) in each of its three modes and aggregating the u, v, and 
w vectors with the Kronecker product. To illustrate, suppose ni = 4, 712 = 3, and 
na = 2 and define the modal unfoldings Ai^i), -4(2) j and .4(3) by 



All 



(1) 



-4(2) = 



am 0121 1x31 aii2 0122 0132 

^211 ^221 ^231 ^212 ^222 1232 

^311 ^321 ^331 ^312 ^322 ^332 

0411 ^421 ^431 ^412 ^422 0432 



Olll 0211 O311 0411 0112 O212 O312 O412 
O12I O22I O321 O421 O122 O222 O322 O422 
O131 0231 O331 0431 ai32 O232 O332 O432 



(1.6) 



A 



(3) 



0111 O211 O311 a4ii ai2l O221 O321 O421 O131 0231 0331 0431 

0112 O212 O312 O412 O122 O222 O322 O422 O132 O232 O332 O432 



The columns of these matrices are fibers. A fiber of a tensor is obtained by fixing all 
but one of the indices. For example, the third column of the unfolding 

^(1) = [ ^(:,1,1) ^(:,2,1) ^(:,3,1) ^(:,1,2) ^(:,2,2) ^(:,3,2)] 

is the fiber 



A:, 3,1) = 



Al,3,l) 
^(2,3,1) 
^(3,3,1) 
^(4,3,1) 



obtained by fixing the 2-mode index at 3 and the 3-modc index at 1. It is necessary 
to specify the order in which the fibers appear in a modal unfolding. The choice 
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exhibited in (1.6) has the property that 



m n2 "3 



^^^Aiii,i2,i3)uiii)v{i2)w{i3) = I v'^ A(^2) w u (1.7) 



il = l 12 = 1 23 = 1 



which makes it easy to specify the stationary vectors of (f>j\,. If u, v, and w are unit 
vectors, then the gradient of 0^ is given by 



V0^(u, u,u;) 



A(3) v<E)u 



u 

V 

w 



We remark that if A is an order-2 tensor, then (1.8) collapses to the familiar matrix- 
SVD equations Av ~ au and A'^u ~ av. 

A central contribution of this paper revolves around the tensor version of the sym 
matrix (1.1) and the associated Rayleigh quotient 0^^™-* that is defined in (1.4). Just 
as sym-of-a-matrix sets up a symmetric block matrix whose entries are either zero 
or matrix transpositions, sym-of-a-tensor sets up a symmetric block tensor whose 
entries are either zero or a tensor transposition. 

U Ae ll"iX"2X"3^ then there arc 6 = 3! possible transpositions identified by the 
notation A^ '"I ^ where [i j k] is a permutation of [12 3]: 



B 



f ^< [1 2 3] > 
_4< [1 3 2] > 
j[< [2 1 3] > 
_4< [2 3 1] > 
j[< [3 1 2] > 
A< [3 2 1] > 



S(z,j,fc) 
B{z,k,j) 
B{j,i,k) 
B{j,k,i) 
B{k,i,j) 
B{k,j,i) 



AiiJ,k) 



(1.9) 



for i = l:ni, j = l:n2, k = I'.n^. 

The symmetric embedding of a 3rd-order tensor results in a 3-by-3-by-3 block ten- 
sor, a kind of Rubik's cube built from 27 (possibly non-cubical) boxes. If ^ G ]F!,"^^"^^'^ 
and N = ni+n2 + n3, then syni{A) = C G ^ ^ is the 3-by-3-by-3 block tensor 
whose ijk block is specified by 

r _4< H j k] > jf j ^] ig a permutation of [1 2 3] 

Ci.jk] = { (1.10) 
[ e 11"-''"^ ''"'= otherwise. 



See Fig 1.1. The blocks in a block tensor such as C can be specified using the colon 
notation. For example, if ni = 4. 712 = 3 and = 2, then 
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C(:,:,3) 




Fig. 1.1. The Symmetric Embedding of an Order-3 Tensor 



C[l 2 3] 


= C(l:4,5:7,8:9) 


^ _4< [1 2 3] > 




C[l 3 2] 


= C(l:4,8:9,5:7) 


^ _4< [1 3 2] > 


€ R"^ XT13 Xn2 


C[2 1 3] 


= C(5:7,l:4,8:9) 


^ _4< [2 1 3] > 


G R"^ xni Xn3 


C[2 3 1] 


= C(5:7,8:9,l:4) 


^ [2 3 1] > 


G R"^ XTl3 xni 


C[3 1 2] 


= C(8:9,l:4,5:7) 


^ _4< [3 1 2] > 




C[3 2 1] 


= C(8:9,5:7,l:4) 


^ [3 2 1] > 


G R^'^ X 712 X ^11 



(1.11) 



We will prove in section 2.3 that the tensor C is in fact symmetric. 

The last topic to cover in our order-3 preview is the generalization of the Rayleigh 
quotient 0^'^™^ defined in (1.4). If G R"iX"2X"3^ c = sym{A), N^ni+n2+ ng, 
and X G R^, then (/)^^™'' is defined by 



12 — 1 23 

It will be shown in section 4.3 that if 



^ / N N N \ 

^ I ^ ^ C{ii,i2,i3)x{ii)x{i2)x{i3) 

\il=l 12 = 1 23 = 1 / 



(1.12) 





u 


}ni 


X ~ 


V 


}n2 




w 


}n3 



satisfies Vc/)^^™'' [x 



0, then 



VyCj)A{u,V,w) 



<t>A{u,v,w) = 
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where \^ refers to the gradient with respect to the components in vector z. Moreover, 
it wiU be shown that 



u 

V 

-w 



u 

-V 

w 



u 

-V 

-w 



are also stationary vectors for 



and 



Asym) 



Asym) 



2. The Symmetric Embedding. Block matrix manipulation is such a fixture 
in numerical linear algebra that we take for granted the correctness of facts like 



All Ai2 
A21 A22 



A 



T 
11 


^21 


T 
12 


^22 



(2.1) 



Formal verification requires showing that the {i, j) entries on both sides of the equation 
are equal for all valid ij pairs. 

The symmetric embedding of a tensor involves generalizations of both transposi- 
tion and blocking so this section begins by discussing these notions and establishing 
the tensor analog of (2.1). Since vectors of subscripts are prominent in the presenta- 
tion, we elevate their notational status with boldface font, e.g., p = [412 3]. We let 
1 denote the vector of ones and assume that dimension is clear from context. More 
generally, if TV is an integer, then N is the vector of all A^'s. Finally, if i and j have 
equal length, then i < j means that ik < jk for all k. 

2.1. Blocking. If s and t are integers with s < t, then (as in Matlab) let s:t 
denote the row vector [s, s -I- 1, • • • ,t]. We refer to a vector with this form as an index 
range vector. The act of blocking an mi-by-m2 matrix C is the act of partitioning 
the index range vectors l:mi and l:m2'. 



l:mi 



.(1) 



,(2) 



l:m2 



.(2) 



(2.2) 



Given (2.2), we are able to regard C as a 61 x 62 block matrix (C^^.i^) where block 
Cij^ij has length(r^^'') rows and length(rj-^'') columns. It is easy (although messy) 
to "locate" a particular entry of a particular block. Indeed, 

C.,,.,(J1,J2) = C{f^^ +Ji, p'i^ +J2) 



where 



(fc) 



length(rf ^) + length(r^'''^) 



length(r,(fLi) 



(2.3) 



for k = 1:2. 

To block an order-d tensor C G j^'"!^ "^™'' t^jq proceed analogously. The index- 
range vectors l:mi, . . . , l:md are partitioned 



l:mfc 



l:d 



(2.4) 



and this permits us to regard C as a 61 x • • • x 6^ block tensor. If i = [ii, . . . then 
the i-th block is the subtensor 
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If j = [ji , . . . , jd] , then the j-th entry of this subtensor is given by 

Ci(j) = Cif^ +n,..., +j^eR (2.5) 

where p^^"* is specified by (2.3) for A: = 

To illustrate equations (2.3)-(2.5), if C € Rfx^xSxe ^^^^ 

1:9 = [ 1 2 3 4 I 5 6 I 7 8 9 ] (6i = 3) 

1:7 =[ 1 2 3 4 5 I 6 7 ] {h = 2) 

1:5 1 2 3 4 I 5 ] (63 = 2) ' 

1:6 =[ 1 2 I 3 4 I 5 6 ] (^4 = 3) 

then we are choosing to regard Casa3x2x2x3 block tensor. Thus, if i = [3121] 
then Ci = C(7:9, 1:5, 5:5, 1:2) and 

Ci(j) = C(6 + ji, j2,4 + j3, J4) 

where 1 < j < [3 5 12]. 

2.2. Tensor Transposition. U Ae ]R"ix •■x«d and p = [pi,-.- ,Pd] is a per- 
mutation of l:d, then y4,<P> e ]R"pi x - xnp^ denotes the p-transpose of A defined by 

-4<P>(jpi,...,jpJ = Aiji,...Jd) 

where 1 < jfc < nk for k ~ l:d. A more succinct way of saying the same thing is 

-4<P>(j(p))=^(j) l<j<n. 

If A is an order-2 tensor, then A^ ^1 ^(j2, Ji) = A{ji,j2)- It is also easy to verify 
that if f and g are both permutations of l:d, then 

(^<f>)<g> ^ ^<f(g)>. (2.6) 

A transposition of a block tensor renders another block tensor. The following 
lemma makes this precise and generalizes (2.1). 

Lemma 2.1. Suppose C G ]pj™ix - x™£i is a bi x ■ ■ ■ x bd block tensor with block 
dimensions defined by the partitioning (2.4)- Let d denote its i-th block where i = 
[ii, . . . , id]. Ifp= [pi, . ■ . ,pd] is a permutation of l:d and B ~ C^^^ , then the tensor 
B G H™''! x - XTOp^ is a bp-^ X ■ ■ ■ X bp^ block tensor where each block i3i(p) is defined by 
Si(p)-C<''>. 

Proof. If 1 < jk < rnk for k ~ l:d, then from (2.4) and (2.5) we have 

^i^^^ Upi : ■ • ■ : ipd ) Ci (jl , . . . , Jp) = C {p'l^^^ + jl , . . . , p[f + id) 

On the other hand, B = C^p^ and so 

C(pW+ji,...,pif +j,) ^ B{p[ll^+jp„...,p[l^J+jpJ = 6i(p)(j,,,...,j,J. 
Thus, Si(p)(j(p)) = Ci<P>(j(p)) for all j , i.e., B^p^ = C^^^ . □ 
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2.3. The sym(-) Operation. An ordcr-d tensor C e -j^Nx - xN symmetric 
if C = C^P^ for any permutation p of l:d. The tensor analog of (1.1) involves 
constructing an ordcr-d symmetric tensor sym(^) whose blocks are either zero or 
carefully chosen transposes A. In particular, if ^ S j^^i^ "^"-d^ then 

sym(yt) gR^^-''^ N^ni + ---nd 

is a block tensor defined by the partitioning l:iV = [ ri | • • • | ] where 

rfc = (l + ni + --- + ?7,fc_i):(ni + ---+nfc) k ^ l:d. (2.7) 

The i-th block of C = sym(^) is given by 

1 > if i is a permutation of l:d 

otherwise 



Ci = 



for all i that satisfy 1 < i < d. Note that Ci is x n.i^ x • • • x n^^. We confirm that 
sym(^) is symmetric. 

Lemma 2.2. If A e ]R"i^--^"d and C = sym.{A), then C is symmetric. 

Proof. Let p be an arbitrary permutation of l:d. We must show that ii B = p ^ 
then B =C. Since C as a block tensor is dxdx- • - xd, it follows from Lemma 2.1 that 
B has the same block structure and 

Si(p) = C5P> 

for all i that satisfy 1 < i < d. If i is a permutation of ltd, then C\ = and by 

using (2.6) we conclude that 

Si(p) = (^<'>)<P> = A<'^^> = Ci(p) 
If i is not a permutation of l:d, then both Ci and Ci(p) are zero and so 

Si(p) =C5P> = =Ci(p). 

Since B and C agree block-by-block, they are the same. □ 

3. Orderings, Unfoldings, and Summations. In numerical multilinear alge- 
bra it is frequently necessary to reshape a given tensor into a vector or a matrix and 
vice versa. In this section we collect results that make these maneuvers precise. 

3.1. The col Ordering. If i and s are length-e index vectors and 1 < i < s, 
then we define the integer-valued function ivec by 

ivec{i, s) = ii + (^2 - l)si H h (ie - l)(si ■ • • Se-i) 

IfJ" G M'*^^"^*", then v = vec(J') G H''^''*" is the column vector defined by 

v{ivec{i, s)) = 1 < i < s. 

Note that if e = 2, then is a matrix and vec(J^) stacks its columns. We also observe 
that if Wk S IR'''' for k = l:e, then 

W — We®---®Wi <4> w{ivec{i,s)) ~ Wi{ii) ■ ■ ■ We{ie)- (3.1) 



8 



S. RAGNARSSON and C.F. VAN LOAN 



3.2. Modal Unfoldings. In the gradient calculations that follow, it is particu- 
larly convenient to "flatten" the given tensor A E ]p^"ix •x"ti j^^Q ^ matrix. If 



n=[n(l:A:-l) n{k + l:d)], (3.2) 

i=[i(l:fc-l) i(fc + l:d)], (3.3) 

then the mode-k unfolding A(k) is defined by 

^(fc)(ifc,iwec(i,n)) = ^(i) 1 < i < n. (3.4) 



This matrix has rows and ni ■ ■ ■ rik-ink+i ■ ■ ■ rid columns. A third-order instance 
of this important concept is displayed in equation (1.6). We mention that there are 
other ways to order the columns in A(^ky See |14| . 

While the columns of A(^k) mode-Zc fibers, its rows are reshapings of its mode-k 
subtensors. In particular, it 1 < r < Uk, then 

^(fc)(r,:) = vec(SW)^ 
where the mode-fc subtensor i?'''-' has order d — 1 and is defined by 

B'-''\ii, . . . ,ik-i,ik+i, ■ ■ ■ ,id) = A{ii, . . . ,ik-i,r,ik+i, ■ ■ ■ ,id)- 

The partitioning of an order-d tensor into order-((i— 1) tensors is just a generalization 
of partitioning a matrix into its columns. 

3.3. Summations. It is handy to have a multi-index summation notation in 
order to describe general versions of the summations that appear in (1.5) and (1.12). 
If n is a length-d index vector, then 

n Til rid 

E--E- 

i=l Ji = l id = i 

The summation that defines the multilinear Rayleigh quotient (1.5) can be written in 
matrix- vector terms. 

Lemma 3.1. If A e ]R"ix-x"<* and Uk e M"" for k = l:d, then 

n 

'^A{i)ui{ii)---Ud{id) = vec{Afud<» ■ ■ ■ <E) m. (3.5) 

i=l 

Moreover, for k ^ l:d we have 

n 

^A{i)ui{ii)---Ud{id) = ulA(k)Uk (3.6) 

i=l 

where 

Uk = {ud® ■■■ ® Uk+i ® Uk^i ® • • • ® ui). (3.7) 
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Proof. If a = vec(^) and b — Ud 
equations (3.1)- (3.4), wc have 



1 Wi, then using the definition of vec and 



n n n 

''^A{i)ui{ii) ■ ■ ■ Ud{id) = ^ a(i?;ec(i, n)) • 6(wec(i, n)) = ^ a(i)5(i) = a^b. 

i=l i=l i=l 

This proves (3.5). Using the modal subtcnsor interpretation oi A(^k} that we discussed 
in §3.2 and definitions (3.2) and (3.3), we have 

^^(i)7.l(zi)---7/d(*rf) = ^ 7.fc(zfc) ^S(^'')(i)7i(i) 
i=l jfc = l \i=l 

= ^Uk{ik){A(k){ik,--)uk) = ulA(k)Uk 



which estabUshes (3.6). □ 

Summations that involve symmetric tensors are important in later sections. The 
following notation for the multiple Kronecker product of a single vector is handy: 



d times 

Note that if .T e ]R^, then .t®'' e 

Lemma 3.2. //C e R^'" '''^ is a symmetric order-d tensor and x G R^, then 

N 

Y,C{i)x{z,)---x{id) = .T^C(i)X«(''-i) (3.8) 

i=l 

Proof. This follows from Lemma 3.1 by setting — N and Uk — x for k — l:d. 
Note that because C is symmetric, C(i) = • • . = Cj^j. □ 

The summation (3.8) has a special characterization if C = sym(^). To pursue 
this we will have to navigate Cs block structure and to that end we define the index 
vectors L and R as follows: 



1 

ni + 1 



R 



"1 

ni + n2 



Hi 



ni H h Ud-i + 1 

Note that if 1 < p < d, then 

Cp = C(L(pi):R(pi),...,L(pd):R(pd)) 

is Cs p-th block. 



nd 



(3.9) 



Lemma 3.3. Suppose A 



,ni X ■ ■ ■ X rid 



X € is partitioned as follows 



Ul 



C = sym(^), and N = ni + ■ ■ ■ + rid- If 



Uk e R"^ 



Ud 
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and Ui, . . . ,Ud are defined by (3.7), then 



Cfijx®^'*-!) = {d-iy. 



(3.10) 



and 



N 



j=l i=l 



(3.11) 



Proof. If V = C(i)a;«^(''-i' and 



Wl 



Wd 



(3.12) 



is partitioned conformally with x, then for j = l:iV wc have 

N 

v{j)= ^ C{j,i2,...,id)x{i2)---x(id) 

i(2:d) = l 
N 

= ^C{i)ej{ii)x{i2) ■ ■ ■ x{id) 
\=1 
d R.{p) 

= XI H C{i)ej{ii)x{i2) ■ ■ ■ x{id) 

p=l i=L(p) 
d /n(p) 

= X XCp(k)wp,(/ci)up2('':2)---Up^(fcd) 

p=l \k=l 

Now suppose that L((7) < j < R((z), j = +r — 1. From (3.10) we must show that 
Vj is the rth component of A(^q)Uq. 

To that end observe that Cp(k)wpj (fci) is necessarily zero unless pi = q, ki = r, 
and p is a permutation of lid. Assuming this to be the case and defining the vectors 
vi, ...,Vdhy 

[ Ui if t 7^ g 



Wq otherwise 



we see using (3.6) that 
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n(p) n(p) 

Cp{k)wp, (fci)up, (fcz) • • • Up, (fed) = -^"^^^ ^^'^^'P^ iki)vp2 (^2) • • • vp^ (fed) 

k=l k=l 



Y ^(k)wi(fci)u2(fc2) • • • Vdikd) 



k=l 



= v'^A(g)Vd ^ 



5 Uq+1 <E) Uq-l ® ■ ■■ ®Ul 



Observe that the number of p that satisfy 1 < p < d subject to the constraint pi = q 
is (d— 1)! and conclude from (3.12) that Wq = /„^(:,r). It fohows that 

d 

p = 1 

This estabUshes (3.10). Equation (3.11) follows from 



(k)Uk 



k=l 



and Lemmas 3.1 and 3.2. □ 



4. Rayleigh Quotients and Stationary Values. Suppose G j^mx xn^ ^^^j 
Uk G IR"'" for A; = 1 Iff. Analogous to (1.3) we define the multilinear Rayleigh Quotient 



0^(Mi,...,Ud) = ^^(i)ui(ii) • • • Ud( 



Id) 



Ul 



Ud 



(4.1) 



i=l 



If C = sym(^), N = rii + ■ ■ ■ riij, and a; G H , then corresponding to (1.4) we have 



(4.2) 



In this section we examine these multilinear Rayleigh quotients, specify their gradi- 
ents, and relate the singular values of A to the eigenvalues of sym(^). 

4.1. The Singular Values of a General Tensor. The gradient of 0^(wi, . . . , w^) 
relates to a collection of matrix-vector products that involve the modal unfoldings of 
A and Kronecker products of the u-vcctors. 

Theorem 4.1. If A e ]R"ix -x"<i and for k = l:d the vectors Uk G M"" each 
have unit 2-norm, then 



V(/)^(wi, . . . ,Wd) 



A(i)Ui 
A{d)Ud 



4>Aiui, . . . ,Wd) 



Ul 



Ud 
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where Uk = (wd ® • • • (8) u^+i Uk-i <8) • • • <^Ui). 
Proof. From Lemma 3.1 we have 

(j)A{ui,...,Ud) = (Ufc ^(fc) Wfc) / (II Ul • • • II II2) • 

For k = l:d we have Vu^i^l A(k) Uk) = A{k) "fc and V„J|| ui \\^ - ■■ \\ Ud Ha) = Uk- 
and so 

^ , _ (H "1 II2 • • - II "rf ll2)Afc)^fc ~ {ulA(k)Uk)uk 

(II ui lla'-'ll Ud W^Y 
= A[k)Uk - 4>a{ui, ■ ■ . ,Ud)uk. 
The theorem foUows by simply "stacking" these subvectors of the gradient. □ 

The variational approach to tensor singular values and vectors set forth in |15) is 
based on equating the gradient of 0^ to zero. 

Definition 4.2. The scalar a ^ ^ is a singular value of a general tensor 
A e irix -x"<i if there are unit vectors Uk € H"*" such that 

A(k)Uk = cTUk, (4.3) 

for k = l:d. The vector Uk is the niodc-A: singular vector associated with a. 

The normalization condition uf^Uk = 1 is necessary, since if Vk = auk for k = l:d then 
A(k)Vk = a''~^.4(fc)Ufe = a^~^(juk = {a^~'^(j)vk for any a S M. It can be shown that at 
least one singular value and associated singular vectors exist for any tensor (cf. |15|V 

4.2. The Eigenvalues of a Symmetric Tensor. For a symmetric tensor C, 
the stationary values of 4'c{x, . . . , x) define the notion of a tensor eigenvalue. 

Theorem 4.3. IfCE j^^x -x^ symmetric and x G has unit norm, then 
V,0c(x, ■■.,x) = d (C(i)x®('^-i) - 0c(x, . . ■ , x)x) . 

Proof. From Lemma 3.2 we have 

Mx,---,x) = x^C^,^x^^''-''>/\\x\f. 

Since 
and 

V^Ax'^xf/^ = d{x'^xY'^-^x 

it follows that 

(x^x)''/2C(i)X®(''-i) - (x^C(i)X®(''-i)) {x'^xYZ-'-^x 



Vx4>cix, ...,x)=d 



{x^xY 

d(C(i)X«('^-i) - [x%k)X^-^'''^)x) 



d 
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completing the proof of the theorem. □ 

By setting the gradient of ipcix^ . . . ,x) to zero we arrive at the notion of a tensor 
eigenvalue [15] . 

that 



Definition 4.4. IfC G Ji^^ "^^ is symmetric andx € is a unit vector such 



C(i)a;®('*-i) = Ax (4.4) 
then X = (jjcix, ■ ■ ■ , x) is an eigenvalue of C and x the associated eigenvector. 

Note that if C(i)a:®('*-i) = Aa; and a G R, then C(i)(aa;)®('^-i) = (a'^-^ X){ax). 
Thus, we resolve a uniqueness issue by requiring tensor eigenvectors to have unit 
length, something that is not necessary in the matrix (d = 2) case. 

In [m [TH] it is shown that eigenvalues and associated eigenvectors always exist 
for symmetric tensors. Recently it has been shown that a symmetric tensor has at 
most {{d — 1)^ — l)/{d — 2) eigenvalues, counted with multiplicity [1]. 

4.3. The Eigenvalues of sym{A). Since C = sym(^) is so structured, we 
anticipate that the eigenvalue-defining equation V0^*""'' (x) = will have some special 
features. From the definitions (4.1) and (4.2) and Theorem 4.3, we have 



v^.7""(i) 



lisyrn) 



(j)cix,...,x)x'j (4.5) 



We first characterize the gradient of in terms of matrix-vector products that 

involve ^'s modal unfoldings. 

Theorem 4.5. If A S andx has unit 2-norm, then 

{ulA(l)Ul)ui 



yJX'"^\x) 



A[i)Ui 






- d 


_ A{d)Ud _ 





{UdA(d)Ud)Ud 



(4.6) 



Proof. From Lemmas 3.1 and 3.3 and the definitions (4.1) and 4.2) we have 



-lj2c{i)x{z^)---x{id) 
(^A{i)ui{ii) ■ ■ ■ u(^id)^ 

ulA(k)Uk 



(II ^ 



iludYi-^ 



for k ~\:d. Since 

V„,(w^ui + •••+ u^d'^df''^ = d-(u\u^ + ■■ 
Since x^ x = u^u^ + ■ • • ujud, we can conclude that 



UdUd) 



^l^-^u, = d- 



Uk- 



^uJX''^\x) 



{x'^xY/^A(k)Uk - d ■ {ulA(k)Uk){x^xY/^-^Uk 



{x^xY 

A[k)Uk - d- {ulA(k)Uk)uk 
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completing the proof of the theorem □ 

It turns out that if the gradient of (j>[^^"^^ (x) is zero, then the vector x generally 
has the property that each subvector Uk has the same norm. 

Corollary 4.6. // V<?!)^^"^ (.x) = and x'^x = 1, then either A(^k)Uk = for 
k = l:d or 



A(d)Ud 



= d-^^J'-^^x) 



Ml 



Ud 



and II Ml II2 = II M2 II2 = • • • = II Ud II2 = l/Vd. 

Proof. Since 'V4>[^^"^\x) = 0, we know from Theorem 4.5 that 

A(k)Uk = d ■ {ulA(k)Uk)uk 



for k^l:d. Thus, 



ulA(k)Uk = d- {ulA(k)Uk){uluk). 



From Lemma 3.1, if u^A(^k)Uk = for some fc, then it is zero for all k. In this case 
we conclude from ()4.6p that A[k)Uk = for k ^ l:d. Otherwise, 1 = duJ^Uk, k ^ l:d. 

It follows that II Ml II 2 = ••• = \\ud\\2 = 1/Vd. □ 

We are now ready for the main result that relates the eigenvalues and vectors of 
sym(^) to the singular values and vectors of A. 

Theorem 4.7. If a is a nonzero singular value of A G ]p;'*i^"^"ti with unit modal 
singular vectors mi, . . . , Ud, then 



1 

Vd 



Q!iUi 
a2U2 

adUd 



II, ±1, 



is an eigenvector for sym(yl) corresponding to eigenvalue 

An 



aiQ:2 



d\ 

■ ad ^^cr- 



-±1] 



Note that ai is set to +1 to resolve a uniqueness issue. See discussion after definition 
4.4 OLi^d also equation (1.2) for the matrix case. 

Proof. We must show that g = V(/)^^™^(xa) =0. If r = ai • • • then for k = l:d 
we have from (4.6) that 



9k 



AkUk 



d(rf+l)/2 



{ulAlk)Uk) Uk 



But since cr = uJ^AkUk and AkUk = ctm^., wc have 
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Since Aq — x'^C(i)Xa^'^ wc have from Lemma 3.3 that 

(t / ai)A{i)Ui 

{T/ad)A(d)Ud 





aiui 










' {d-iy. 












adUd 







k=l 



fc=l 
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d\ 



d<^ 



completing the proof of the theorem. □ 

Thus, for each singular value and vector for A wc have 2"^"^ eigenvalue/eigenvector 
pairs for sym{A). 



4.4. Connections to the Multilinear Transform. Suppose € M' 



and Bk e ]R""^"= for fc = lirf. The tensor Te K 



defined by 



r(i) =5]-F(j)Bi(ii,Zi)B2(j2,»2)---Sfc0fc,Zfc). 



(4.7) 



is the multilinear transform [7] of tensor F by the matrices Bi, . . . , Bd and is denoted 
by 



T = T-iBi,B2,...,Bd). 



We also define 



{Bi,B2,. ■ ■ ,Bd) ■ = ■ {B'I ,BJ , . . . ,Bd). 



(4.8) 



(4.9) 



Some of the key summations and vectors above can be expressed neatly through this 
transformation. For example, ii A £ ]R_"ix -x"<i and Uk G M"'" for k = l:d, then 

n 

A - {ui,...,Ud) = y^^A{i)ui{ii) ■ ■ ■ Udiid) = ulA(i)Ui 



and 



A - {Ui, . . . ,Uk-l,InkiUk+l, ■ ■ ■ ,Ud) = A(k)Uk- 

5. Higher Order Power Methods. We now briefly review various tensor 
power methods and consider them in light of the singular- and eigenvalue connec- 
tion between A and sym(^). 

5.1. The HOPM. The matrix power method method can be generalized to ten- 
sors by replacing the matrix-vector multipication with multilinear transforms. The 
Higher- Order Power Method of [5l |6] for finding a singular value and associated sin- 
gular vectors of general order-d tensors proceeds in an alternating fashion to update 
each of the mode-j singular vectors Uj. 

Different initial values for the Uj vectors will in general result in convergence to 
different singular values. See Section 5.4 for a discussion on popular choices for higher- 
order power method initial values. 
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Algorithm 1 The higher-order power method (HOPM) [5l |6| 
Given an order-d tensor A G K'^x - xrid^ 

Require: uf e R"^ with ^Hs = 1. Let cr(") = {u^°'> f A^i-,u^°\ 



for fc = 0, 1, . . . do 
for j = 0, 1, . . . do 

uf-'^ ^ A^A'^'^ ® • • • ® "5+"^'^ ® 1 ® • • • ® 4'^ 

end for 
end for 



The HOPM can also be viewed as a way of finding the best rank-1 tensor approx- 
imation ^ to v4 [5]. Specifically, a tensor TG ]FL"i^ "^"<i is said to be rank-1 if for 
k = l:d there exist vectors ti E M"' such that for all i = 1, . . . , n 

Tii)=ti{h)t2it2)---td{td) (5.1) 
and we then say that T is the tensor outer product of the vectors ti, . . . ,td, denoted 

by 

T = ti o t2 o ■ ■ ■ o td- (5-2) 

It can be shown that the HOPM converges to a local minimum of the functional 
f{A) = II .4 — v4 II j^, where A = <t ui o ■ ■ ■ o is a rank-1 approximation to A and 

the Frobenius norm of a tensor T is defined as || T ||^ = \JY1^=i "Ti^Y- See [TT] , 

The HOPM can be applied to an order-d symmetric N x ■ ■ ■ x N tensor, starting 
with a symmetric initial guess u^"^ — Uj"'' = • • • = u^^'^ G M^. The solution found 
by the algorithm will be symmetric but intermediate results may break symmetry. 

(k) 

Indeed, after one iteration the uj vectors will in general all be distinct, but Uj ^ u 
as /c — > oo for some u G 5 . 

5.2. TheS-HOPM. Recentlv. [TU] investigated a modified version of the HOPM 
for symmetric tensors which was originally dismissed by [5] as unreliable since in gen- 
eral it is not guaranteed to converge. This algorithm is called the Symmetric Higher 
Order Power Method (S-HOPM) and converges for certain classes of symmetric ten- 
sors. For example, suppose C is a symmetric tensor of even order and that M is a 
square unfolding of C. If M is semidefinite then the S-HOPM converges [TO]. 

Algorithm 2 Symmetric higher-order power method (S-HOPM) [3 [10] 
Given an order-c? symmetric tensor C G R^x -x^. 

Require: x^°^ G with ||x(o)||2 = 1. Let = (x(o))i^C(i)(a;(o))®(''-i). 
1: for fc = 0, 1, . . . do 

2: x^'^+i) ^C(i)(a;W)®('*-i) 

3: ^ X^'^+l) /p^'^+l) II2 

4: A(^+l) ^ (x(^-+l))^C(i)(x('=+l))®(''-l) 

5; end for 



This approach avoids the awkward situation, mentioned previously, of encountering 
non-symmetric intermediate values when using the HOPM on a symmetric tensor. 
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Since sym(^) is symmetric for any tensor A, the S-HOPM can be apphed to 
A through its embedding. By using facts previously estabhshed, wc can reduce all 
operations on sym(^) to equivalent ones on A. 



Algorithm 3 Symmetric higher-order power method on sym(^) 
Given an order-d tensor A G M"i>< "^"d. 

Require: uf^ G M"^ with ||^°^||2 = 1. Let a*") = (uf^)^ A(^i)tL[°\ 
1: for fc = 0, 1, . . . do 
2: for j = 0, 1, . . . do 

3: uf^'^^A^^uf 

5: end for 

6: ^ {uf+'^rA^,)u['"''^ 

7: end for 



This algorithm computes a singular value a for A and the mode-j singular vec- 
tors Uj. The normalization used in Algorithm 3 is slightly different than a di- 
rect application of the S-HOPM on sym(^) would imply; the S-HOPM would set 

(fc+l) _ .(fc+l) , /||-(fc+l)||2 , ... , |U-,(fc+l)||2 



Uj = ""j /yll"! ll2 + ''' + ll^d Hi- However, numerical experiments sug- 
gest that using Uj''^^^ = u''^'^^^^ /\\uj''^^''\\ improves convergence. If A is itself symmet- 
ric, then Algorithm 3 reduces to the S-HOPM as all the uj will be equal, assuming 

(0) (0) 

Note that Algorithm 3 is very similar to the regular HOPM except the most 
recently available information on ui, . . . , itj_i is not used when computing wj'^^^^ for 
J > 1. The difference between the HOPM and Algorithm 3 is thus somewhat like the 
difference between the Jacobi and Gauss-Seidel iterative linear system solvers [8]. 

Unlike the HOPM, Algorithm 3 does not always converge and since it can be 
shown that a square unfolding of sym(^) is indefinite unless all the entries in A are 
zero, the convergence criteria in |10j do not apply. 

5.3. The SS-HOPM and sym(-). Recently, Kolda and Mayo [13] developed 
a shifted version of the S-HOPM and proved that for a suitable choice of shift their 
algorithm will converge to an eigenpair (A, x) for any symmetric tensor C. 

Algorithm 4 Shifted symmetric higher-order power method (SS-HOPM) [T5] 
Given an order-d symmetric tensor C € K^^x -x^. 

Require: G with ||a;(")||2 = 1. Let A^") = (a;("))^C(i)(a;(o))«5('i-i). 
1: for fc = 0, 1, . . . do 

2: xC^+i) +- C(i)(xW)®('^-i) + aca;W 

3: a^^'^+i) ^x('=+i)/||x('=+i)||2 

4: A^'^+i) +- (a;('=+i))^C(i)(2;('^+i))®('*-i) 

5: end for 



If the shift ac satisfies \ac\ > (d - 1) Ei=i|C(i)| then the SS-HOPM will converge to 
an eigenpair |13j . 

When C = sym(^) the algorithm can be simplified and expressed in terms of 
operations on A. 
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Algorithm 5 Shifted symmetric higher-order power method on sym(^) 
Given an order-d tensor A G K'^x - xrid^ 

Require: uf^ e K"^ with Wufh = 1/Vd. Let a^^^ = (u'i^)^ A(^i-)u'i\ 
1: for fc = 0, 1, . . . do 
2: for j = 0, 1, . . . do 

.(fe+l) , ~(k) , J (k) 

3: u) ' ^ +daAu) ' 

4; end for 

5; for j = 0, 1, . . . do 

R , /, /||,",('=+l)|j2 , , II -(fe+l)||2 

6: Uj ^Uj /\J\\U^ \\2-\ h II2 

7: end for 
9: end for 



Using Theorem 4.7, a simple normalization of the values returned by this algorithm 
gives a singular value and associated unit norm singular vectors of A. 

The shift aA must satisfy |a^| > (rf — 1) to guarantee convergence, 

although a smaller shift might be sufficient for any particular tensor A. 

Example. Let A be the 2x2x2x2 tensor given by the unfolding 

1.1650 0.2641 -0.6965 1.2460 0.0751 -1.4462 0.0591 0.5774 " 
0.6268 0.8717 1.6961 -0.6390 0.3516 -0.7012 1.7971 -0.3600 ' 

We ran 100 trials of the HOPM, Algorithm 3 and Algorithm 5 using different random 
starting points uf'^ chosen from a uniform distribution on [—1, 1]"' and suitably nor- 
malized for each algorithm. The algorithms are considered to have converged when 
l^(fc-i-i) _ fj'-*^'! < 10"^^. For this example, all three algorithms converged for every 
starting point. 

The HOPM found the singular values 2.7248 and 1.7960. Algorithm 3 converged 
to (J = ±2.7248. Algorithm 5 with a positive shift found 2.7248 and 1.7960 and 
using a negative shift produced the values —2.7248 and —1.7960. 

For this tensor A the theory suggests a shift a a greater than 37.72 in absolute 
value to guarantee convergence. However, using ua as small as 1 will still lead to 
convergence and does so in many fewer iterations, sometimes by as much as a factor 
of 30 when compared to the suggested shift. Setting aA to zero caused the algorithm 
to fail to converge for all chosen starting points. 

5.4. Initialization. A standard way to initialize higher-order power methods is 
to use a truncated form of the Higher-Order Singular Value Decomposition (HOSVD) 

of m, 

A^iUuU2,...,Ud)-S. (5.3) 

where S S jiyH x - xn<i ^^^q Qore tensor, the Ui G M">^"> are orthogonal and related 
to the modal unfoldings of A through the matrix SVD equations A(^i^ ~ UiJ^iVl^ ■ 

To initialize the HOPM, for example, the values uj"'' = Uj{:, 1) have been shown 
[5] to often lie close to the best rank-1 approximation to A. 

If desired, it is possible to create the HOSVD of C = sym(^) from the HOSVD 
of A. For example, if C = {Uc, • ■ • , Uc) ■ Sc is the HOSVD of C then it can be shown 
that Uc is a column permutation of the block-diagonal matrix diag([/i, U2, ■ ■ ■ , Ud)- 
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There are many other ways to initialize tensor power methods. In [TTl Rcgaha and 
Kofidis derive a procedure for symmetric tensors that can outperform the HOSVD- 
based approach. 

Another possibihty is to compute a tensor generahzation of the QR decomposition 
with partial pivoting, of the form A ~ • where A(k) ~ QkRk^k 

are the pivoted QR decompositions of the unfolding .4(j.). It can be shown that 
this "HOQRD" decomposition retains some of the approximation properties of the 
truncated matrix pivoted QR decomposition and can thus give a reasonable initial 
guess for a tensor power method. As for the HOSVD, the HOQRD of sym(^) can 
be constructed from the HOQRD of A. 

6. Tensor Rank and the sym Operation. There are several definitions of 
tensor rank, each of which represents some reasonable generalization of matrix rank. 
For an excellent review see [7]. In this brief section we relate the multilinear rank 
and the outer product rank of sym(^) to the multilinear rank and the outer product 
rank of A. 

6.1. Multilinear Rank. The multilinear rank of ^ e j^nix - xn^ ^-^^ d-tuple 
rankH(^) = (ri(^), r2(^), . . . , where ri{A) = va,-ak{A(i)) . Note that if the 

tensor C G ]FL^^ is symmetric, and R = rank(C(i)), then 

rankH(C) = {R,R,...,R) (6.1) 

because C(i) = C(2) = •••=: Cj^). If C = sym(^), then it is possible to connect 
rankH(C) to rankH(y^). 

Theorem 6.1. If A e ''''''''' and ra.nka{A} = (ri, . . . ,rd), then 
rankH(sym(^)) ^ {R, . . . , R) 

where R = ri + ■ ■ ■ + ra- 

Proof. Suppose C = sym(^) and Ci is C's ith block, 1 < i < d. Let C^'^^ be a 
1 X d X ■ ■ ■ X d block tensor defined by 

= Ci 

where ii ~ k and 1 < ij < d for j ~ 2:d. Note that if i(2:d) is a permutation of 
[l:fc- 1 k + l:d], then 

^ _^<[fci(2:d)]> 

It follows that 

range(c[f)^) = range(^(fc)) k = l:d (6.2) 

If 

}ni 
}nd 

is a block row partitioning of C(i), then Ck is a column permutation of C^'J^-j* and so 
using (6.2) we have 

rank(Cfe) = rank(cf,^h = r^. (6.3) 



(1) 
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If 



}ni 



}nd 



is a column of C(i) then it is a a mode-l fiber of C and thus can "pass through" at 
most one C-block having an index that is a permutation of l:d. This means that at 
most one of v's subvectors is zero. It follows from (6.3) that 

d d 

rank(C(i)) = ^rank(Cfc) = ^ 
fc=i fe=i 

completing the proof of the theorem. □ 

6.2. Outer Product Rank. The outer product rank of A e is the 



minimum number of rank-1 tensors, as defined in ()5.ip and (|5.2p . that are needed to 
represent it as a sum 



rank^ (A) — min <.r:A^ Ui 



O Un O ■ ■ ■ O U 



For matrices rank(sym(A)) = 2 rank(yl). Indeed, If A = ^21=1 '^i'^^i'^I is the SVD of 
A, then 



sym(A) =^0-1 





Vi 



uf 0] + 







[0 



Motivated by this expansion we make a definition. 

Definition 6.2. If T € ]p{"ix -x"<i is the rank-1 tensor T ^ ti o ■ ■ ■ o td anc 
N ~ ni + ■ ■ ■ Ud, then S G ^^^ "^^ ig ifig rank-1 tensor 



S = 7r(r) = Si 



o Sd 



whe 



Sk = 





tk 




}ni-\ h Uk-i 

}nk 

} nk+i H h rid 



With this construction, we can produce an outer product expansion of sym(^) given 
an outer product expansion of A. 



Theorem 6.3. IfAeM 



,ni X ■ ■ ■ X 7lfi 



(1) (2) (d) 
U: o u] O ■ ■ ■ O U- 



where ■ ■ , ul''^ £ K"*" , then 



sym(yl) = ^ ^ ^(wf^ ° ^ 0---0 w(''))<P> 



(6.4) 
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where Sd is the set of all permutations of l:d. 

Proof. Let C be the sum on the right side of (6.4) and note that 

We must show that the qth block of sym(^) equals the qth block of Cq. If q is not 
a permutation of ltd, then these blocks are both zero. Otherwise 

i = l 

completing the proof of the theorem. □ 

Since the double summation in (6.4) involves rd\ terms, it follows that 

rank^(sym(^)) < d!-rankg(^) (6.5) 

We conjecture that equality prevails. This is somewhat reminiscent of the direct sum 
conjecture [H], i.e. that rank^(^© B) = rank^(^) +rank^(B). Intuitively, sym(^) 
contains d\ distinct copies of A in nonoverlapping index regions so if the matrix case 
were to generalize, any expansion of sym(^) into a sum of < d!r rank-1 terms could 
be reduced to (6.4) without adding terms, thus having exactly d\r terms. We have so 
far been unable to prove this. Note that it can be shown that 

(i-rank®(^) < rank0(sym(^)) 

using Lemma 3.5 in [7]. 

7. Conclusions. The symmetrization sym(^) can be used to connect algo- 
rithms for symmetric tensors and ones for general tensors. In this paper we have 
shown how algorithms such as the S-HOPM and SS-HOPM give rise to non-symmetric 
algorithms through the symmetrization in a way that preserves many convergence 
properties. In particular, the non-symmetric version of the SS-HOPM we derive is 
guaranteed to converge for an appropriately chosen shift a^- Are there other tensor 
methods where the symmetrization could be used to spot new connections or derive 
useful algorithms? 

The rank properties of the symmetrization in some ways mirror the matrix case, 
but fundamental questions regarding the outer product rank of sym(^) remain open. 
Resolution of these questions may help bridge the conceptual gap that exists between 
matrix rank and tensor rank. 
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